chemical feature
Exploiting Hierarchical Interactions for Protein Surface Learning
Lin, Yiqun, Pan, Liang, Li, Yi, Liu, Ziwei, Li, Xiaomeng
Predicting interactions between proteins is one of the most important yet challenging problems in structural bioinformatics. Intrinsically, potential functional sites on protein surfaces are determined by both geometric and chemical features. However, existing works only consider handcrafted or individually learned chemical features from the atom type and extract geometric features independently. Here, we identify two key properties of effective protein surface learning: 1) relationship among atoms: atoms are linked with each other by covalent bonds to form biomolecules instead of appearing alone, which makes modeling the relationship among atoms important in chemical feature learning. 2) hierarchical feature interaction: the neighboring residue effect validates the significance of hierarchical feature interaction among atoms and between surface points and atoms (or residues). In this paper, we present a principled framework based on deep learning techniques, namely Hierarchical Chemical and Geometric Feature Interaction Network (HCGNet), for protein surface analysis by bridging chemical and geometric features with hierarchical interactions. Extensive experiments demonstrate that our method outperforms the prior state-of-the-art method by 2.3% in the site prediction task and 3.2% in the interaction matching task. Our code is available at https://github.com/xmed-lab/HCGNet.
Learning Harmonic Molecular Representations on Riemannian Manifold
Wang, Yiqun, Shen, Yuning, Chen, Shi, Wang, Lihao, Ye, Fei, Zhou, Hao
Molecular representation learning plays a crucial role in AI-assisted drug discovery research. Encoding 3D molecular structures through Euclidean neural networks has become the prevailing method in the geometric deep learning community. However, the equivariance constraints and message passing in Euclidean space may limit the network's expressive power. In this work, we propose a Harmonic Molecular Representation learning (HMR) framework, which represents a molecule using the Laplace-Beltrami eigenfunctions of its molecular surface. HMR offers a multi-resolution representation of molecular geometric and chemical features on a 2D Riemannian manifold. We also introduce a harmonic message passing method to realize efficient spectral message passing over the surface manifold for better molecular encoding. Our proposed method shows comparable predictive power to current models in small molecule property prediction, and outperforms the state-of-the-art deep learning models for ligand-binding protein pocket classification and the rigid protein docking challenge, demonstrating its versatility in molecular representation learning.
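As a reading aid for the abstract above, the standard Laplace-Beltrami eigenproblem and the truncated spectral expansion it suggests can be written as follows. This is the textbook formulation, not necessarily the exact notation or sign convention used in the HMR paper; the symbols $\mathcal{M}$ (the molecular surface), $f$ (a surface feature), and $K$ (the truncation level) are assumptions introduced here for illustration.

```latex
% Eigenfunctions of the Laplace-Beltrami operator on a closed surface M
-\Delta_{\mathcal{M}} \, \phi_k = \lambda_k \, \phi_k ,
\qquad 0 = \lambda_0 \le \lambda_1 \le \lambda_2 \le \dots

% Truncated expansion of a surface feature f in the first K eigenfunctions
f \approx \sum_{k=0}^{K-1} \langle f, \phi_k \rangle_{\mathcal{M}} \, \phi_k ,
\qquad
\langle f, \phi_k \rangle_{\mathcal{M}} = \int_{\mathcal{M}} f \, \phi_k \, \mathrm{d}A
```

Eigenfunctions with small $\lambda_k$ vary slowly over the surface and larger $\lambda_k$ capture finer detail, which is one way to read the "multi-resolution" claim: the choice of $K$ sets the resolution.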
Using Artificial Intelligence to Smell the Roses - Neuroscience News
Summary: New artificial intelligence technology can accurately predict how any chemical is going to smell to humans. A pair of researchers at the University of California, Riverside, has used machine learning to understand what a chemical smells like -- a research breakthrough with potential applications in the food flavor and fragrance industries. "We now can use artificial intelligence to predict how any chemical is going to smell to humans," said Anandasankar Ray, a professor of molecular, cell and systems biology, and the senior author of the study that appears in iScience. "Chemicals that are toxic or harsh in, say, flavors, cosmetics, or household products can be replaced with natural, softer, and safer chemicals." Humans sense odors when some of their nearly 400 odorant receptors, or ORs, are activated in the nose.
Using artificial intelligence to smell the roses
A pair of researchers at the University of California, Riverside, has used machine learning to understand what a chemical smells like, a research breakthrough with potential applications in the food flavor and fragrance industries. "We now can use artificial intelligence to predict how any chemical is going to smell to humans," said Anandasankar Ray, a professor of molecular, cell and systems biology, and the senior author of the study that appears in iScience. "Chemicals that are toxic or harsh in, say, flavors, cosmetics, or household products can be replaced with natural, softer, and safer chemicals." Humans sense odors when some of their nearly 400 odorant receptors, or ORs, are activated in the nose. Each OR is activated by a unique set of chemicals; together, the large OR family can detect a vast chemical space.
Practical Graph Neural Networks for Molecular Machine Learning
Chemical fingerprints [1] have long been the standard way to represent chemical structures as numbers, which are suitable inputs to machine learning models. A brief summary of chemical fingerprints is provided in another of my blog posts. Above, we computed the fingerprint for Atorvastatin, a drug that generated over $100B in revenue over 2003–2013. A few years ago, people started to realize [3] that instead of computing a non-differentiable fingerprint, we can compute a differentiable one. Then, by backpropagation, we can train not only a deep-learning model but also the fingerprint-generating function itself. The promise is to learn richer molecular representations.
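To make the "non-differentiable" part concrete, here is a minimal pure-Python sketch of a circular-style fingerprint. It is a toy under stated assumptions, not the algorithm from [1] and not any library's implementation: the molecule (the heavy atoms of ethanol as a hand-written graph), the helper names, and the 64-bit length are all made up for illustration.

```python
import zlib

# Hypothetical toy molecule: heavy atoms of ethanol (C-C-O) as a graph.
ATOMS = ["C", "C", "O"]
BONDS = {0: [1], 1: [0, 2], 2: [1]}


def stable_hash(text):
    """Deterministic integer hash (CRC32) so results repeat across runs."""
    return zlib.crc32(text.encode("utf-8"))


def circular_fingerprint(atoms, bonds, radius=2, n_bits=64):
    """Toy circular fingerprint: hash each atom's growing neighborhood."""
    bits = [0] * n_bits
    # Radius 0: each atom is identified by its element symbol alone.
    ids = [stable_hash(symbol) for symbol in atoms]
    for identifier in ids:
        bits[identifier % n_bits] = 1
    for _ in range(radius):
        new_ids = []
        for i in range(len(atoms)):
            # Sorting neighbor identifiers makes the result independent
            # of the order in which atoms happen to be listed.
            neighbors = sorted(ids[j] for j in bonds[i])
            new_ids.append(stable_hash(f"{ids[i]}|{neighbors}"))
        ids = new_ids
        for identifier in ids:
            # Hashing and modulo indexing are discrete steps:
            # no gradient can flow through them.
            bits[identifier % n_bits] = 1
    return bits


fingerprint = circular_fingerprint(ATOMS, BONDS)
print(sum(fingerprint), "of", len(fingerprint), "bits set")
```

The two discrete steps, the hash and the modulo index, are what block backpropagation. A differentiable fingerprint would have to replace both with smooth, trainable operations.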
Response to Comment on "Predicting reaction performance in C-N cross-coupling using machine learning"
We demonstrate that the chemical-feature model described in our original paper is distinguishable from the nongeneralizable models introduced by Chuang and Keiser. Furthermore, the chemical-feature model significantly outperforms these models in out-of-sample predictions, justifying the use of chemical featurization from which machine learning models can extract meaningful patterns in the dataset, as originally described. In Ahneman et al. (1), we showed that a random forest (RF) algorithm built using computationally derived chemical descriptors for the components of a Pd-catalyzed C–N cross-coupling reaction (aryl halide, ligand, base, and potentially inhibitory isoxazole additive) could identify predictive and meaningful relationships in a multidimensional chemical dataset comprising 4608 reactions. Chuang and Keiser (2) built alternative models using random barcode features ("straw" models), wherein the chemical descriptors are replaced with random numbers selected from a standard normal distribution. One-hot encoded features, wherein each reagent acts as a categorical descriptor and is marked as absent or present, were also evaluated.
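The two control featurizations described above can be sketched in a few lines. This is an illustrative reconstruction only, not the code of either group: the reagent names are hypothetical placeholders, and the number of random features per reagent (4) and the seed are arbitrary choices.

```python
import random

# Hypothetical reagent names, for illustration only (not the dataset's).
COMPONENTS = {
    "aryl_halide": ["ArX_1", "ArX_2"],
    "ligand": ["L_1", "L_2"],
    "base": ["B_1", "B_2"],
    "additive": ["A_1", "A_2", "A_3"],
}


def one_hot_table(names):
    """Each reagent becomes a present/absent indicator vector."""
    return {
        name: [1.0 if i == j else 0.0 for j in range(len(names))]
        for i, name in enumerate(names)
    }


def random_barcode_table(names, n_features, rng):
    """Each reagent gets a fixed vector drawn from a standard normal."""
    return {
        name: [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for name in names
    }


def featurize(reaction, tables):
    """Concatenate the per-reagent vectors in a fixed component order."""
    vector = []
    for role in COMPONENTS:
        vector.extend(tables[role][reaction[role]])
    return vector


rng = random.Random(0)  # arbitrary seed, for repeatability
one_hot = {role: one_hot_table(names) for role, names in COMPONENTS.items()}
barcode = {
    role: random_barcode_table(names, 4, rng)
    for role, names in COMPONENTS.items()
}

reaction = {
    "aryl_halide": "ArX_1",
    "ligand": "L_2",
    "base": "B_1",
    "additive": "A_3",
}
x_one_hot = featurize(reaction, one_hot)
x_random = featurize(reaction, barcode)
print(len(x_one_hot), len(x_random))
```

In both encodings a reagent always maps to the same vector, so the features identify which reagent is present but carry no chemical information. That is why they serve as controls for a model trained on computed descriptors.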
Comment on "Predicting reaction performance in C-N cross-coupling using machine learning"
Ahneman et al. (Reports, 13 April 2018) applied machine learning models to predict C–N cross-coupling reaction yields. The models use atomic, electronic, and vibrational descriptors as input features. However, the experimental design is insufficient to distinguish models trained on chemical features from those trained solely on random-valued features in retrospective and prospective test scenarios, thus failing classical controls in machine learning. A recent report by Ahneman et al. (1) describes a machine learning approach for modeling chemical reactions with data collected through ultrahigh-throughput experimentation. The Buchwald-Hartwig coupling (2) is used as a model reaction, with a Glorius interference approach (3) to study reaction poisoning by isoxazole additives. Reactions are represented by atomic, electronic, and vibrational descriptors that are automatically calculated through a new computational pipeline.